Intermediate mathematics/Discrete mathematics
__TOC__ = Discrete mathematics = Set theory :See also: , , , , , , , , and \varnothing is the empty set (the additive identity) \mathbf{U} is the universe of all elements (the multiplicative identity) a \in A means that is a (or member) of set . In other words a is in A. : \{ x \in \mathbf{A} : x \notin \mathbb{R} \} means the set of all x's that are members of the set such that x is not a member of the . Could also be written \{ \mathbf{A} - \mathbb{R} \} A does not allow multiple instances of an element. \{1,1,2\} = \{1,2\} :A does allow multiple instances of an element. \{1,1,2\} \neq \{1,2\} A set can contain other sets. \{1,\{2\},3\} \neq \{1,2,3\} A \subset B means that is a of : A \subseteq A means that is a of itself. But a set is not a of itself. A \cup B is the of the sets and . In other words, \{A+B\} : \{1,2\}+\{2,3\}=\{1,2,3\} A \cap B is the of the sets and . In other words, \{A \cdot B\} All a's in B. :Associative: A \cdot \{B \cdot C\} = \{A \cdot B\} \cdot C :Distributive: A \cdot \{B + C\}=\{A \cdot B\} + \{A \cdot C\} :Commutative: \{A \cdot B\} =\{B \cdot A\} A \setminus B is the of and . In other words, \{A - A \cdot B\} : \overline{A} or A^c = \{U - A\} is the of A. A \bigtriangleup B or A \ominus B is the of sets and which is the set of all objects that are a members of either or but not in both. : A \bigtriangleup B = (A + B) - (A \cdot B) = (A - A \cdot B) + (B - A \cdot B) A \times B is the of and which is the set whose members are all possible where is a member of and is a member of . The of a set is the set whose members are all of the possible subsets of . A of a set X is a collection of sets whose union contains X as a subset.Wikipedia:Cover (topology) A subset A of a topological space X is called (in X) if every point x in X either belongs to A or is arbitrarily "close" to a member of A. :A subset A of X is if it can be expressed as the union of countably many nowhere dense subsets of X. of sets A_0 = {1, 2, 3} and A_1 = {1, 2, 3} can be computed by finding: : \begin{align} A^*_0 & = \{(1, 0), (2, 0), (3, 0)\} \\ A^*_1 & = \{(1, 1), (2, 1), (3, 1)\} \end{align} so : A_0 \sqcup A_1 = A^*_0 \cup A^*_1 = \{(1, 0), (2, 0), (3, 0), (1, 1), (2, 1), (3, 1)\} Let H'' be the subgroup of the integers (''mZ', +) = ({..., −2''m, −''m'', 0, m'', 2''m, ...}, +) where m'' is a positive integer. :Then the of ''H are the ''mZ' + a'' = {..., −2''m+''a'', −''m''+''a'', a'', ''m+''a'', 2''m''+''a'', ...}. :There are no more than m'' cosets, because ''mZ''' + m'' = ''m('''Z + 1) = m'Z'. :The coset (m'Z' + a'', +) is the of ''a modulo m''.Joshi p. 323 :Cosets are not usually themselves subgroups of G, only subsets. \exists means "there exists at least one" \exists! means "there exists one and only one" \forall means "for all" \land means "and" (not to be confused with ) \lor means "or" (not to be confused with ) Back to top Probability \vert A \vert is the of A which is the number of elements in A. See . P(A) = {\vert A \vert \over \vert U \vert} is the unconditional that A will happen. P(A \mid B) = {\vert A \cdot B \vert \over \vert B \vert} is the that A will happen given that B has happened. P(A + B) = P(A) + P(B) - P(A \cdot B) means that the probability that ''A or B'' will happen is the probability of ''A plus the probability of B'' minus the probability that both ''A and B'' will happen. P(A \cdot B) = P(A \cdot B \mid B)P(B) = P(A \cdot B \mid A)P(A) means that the probability that ''A and B'' will happen is the probability of "A and B given B" times the probability of B. P(A \cdot B \mid B) = \frac{P(A \cdot B \mid A) \, P(A)}{P(B)}, is If you dont know the certainty then you can still know the probability. If you dont know the probability then you can always know the Bayesian probability. The Bayesian probability is the degree to which you expect something. Even if you dont know anything about the system you can still know the Bayesian probability. As new information comes in the is updated and replaced with the by using . From Wikipedia:Base rate fallacy: In a city of 1 million inhabitants let there be 100 terrorists and 999,900 non-terrorists. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic facial recognition software. 99% of the time it behaves correctly. 1% of the time it behaves incorrectly, ringing when it should not and failing to ring when it should. Suppose now that an inhabitant triggers the alarm. What is the chance that the person is a terrorist? In other words, what is P(T | B), the probability that a terrorist has been detected given the ringing of the bell? Someone making the 'base rate fallacy' would infer that there is a 99% chance that the detected person is a terrorist. But that is not even close. For every 1 million faces scanned it will see 100 terrorists and will correctly ring 99 times. But it will also ring falsely 9,999 times. So the true probability is only 99/(9,999+99) or about 1%. relates to the act of '''arranging' all the members of a into some or . The number of permutations of n'' distinct objects is .Wikipedia:Permutation :A derangement is a permutation of the elements of a set, such that no element appears in its original position. In other words, derangement is a permutation that has no . The number of of a set of size n'', usually written , is called the "derangement number" or "de Montmort number".Wikipedia:derangement ::The are a triangular array of integers that enumerate permutations of the set { 1, ..., n'' } with specified numbers of fixed points: in other words, '''partial derangements'.Wikipedia:rencontres numbers a is a selection of items from a collection, such that the order of selection does not matter. For example, given three numbers, say 1, 2, and 3, there are three ways to choose two from this set of three: 12, 13, and 23. More formally, a k''-'''combination' of a S'' is a subset of ''k distinct elements of S''. If the set has ''n elements, the number of k''-combinations is equal to the : \binom nk = \textstyle\frac{n!}{k!(n-k)!}. Pronounced n choose k. The set of all ''k-combinations of a set S'' is often denoted by \textstyle\binom Sk . The '''central limit theorem' (CLT) establishes that, in most situations, when are added, their properly normalized sum tends toward a (informally a "bell curve") even if the original variables themselves are not normally distributed.Wikipedia:Central limit theorem In , the standard deviation (SD, also represented by the Greek letter sigma or the Latin letter s) is a measure that is used to quantify the amount of variation or of a set of data values. A low standard deviation indicates that the data points tend to be close to the (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.Wikipedia:standard deviation The is a discrete probability distribution that describes the probability of k successes (random draws for which the object drawn has a specified feature) in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature, wherein each draw is either a success or a failure. :In contrast, the describes the probability of k successes in n draws with replacement.Wikipedia:Hypergeometric distribution is used to model the risk of extreme, rare events, such as the 1755 Lisbon earthquake. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed. See also , , Back to top Logic From Wikipedia:Inductive reasoning Given that "if A'' is true then the laws of cause and effect would cause ''B, C'', and ''D to be true", :An example of deduction would be ::"A'' is true therefore we can deduce that ''B, C'', and ''D are true". :An example of induction would be ::"B'', ''C, and D'' are observed to be true therefore ''A might be true". ::A'' is a reasonable explanation for ''B, C'', and ''D being true. For example: :A large enough asteroid impact would create a very large crater and cause a severe impact winter that could drive the dinosaurs to extinction. ::We observe that there is a very large crater in the Gulf of Mexico dating to very near the time of the extinction of the dinosaurs ::Therefore this impact is a reasonable explanation for the extinction of the dinosaurs. The Temptation is to jump to the conclusion that this must therefore be the true explanation. But this is not necessarily the case. The Deccan Traps in India also coincide with the disappearance of the dinosaurs and could have been the cause their extinction. A classical example of an incorrect inductive argument was presented by John Vickers: : All of the swans we have seen are white. : Therefore, we know that all swans are white. The correct conclusion would be, "We expect that all swans are white". As a logic of induction, Bayesian inference does not determine which beliefs are a priori rational, but rather determines how we should rationally change the beliefs we have when presented with evidence. We begin by committing to a prior probability for a hypothesis based on logic or previous experience, and when faced with evidence, we adjust the strength of our belief (expectation) in that hypothesis in a precise manner using Bayesian logic. Unlike deductive arguments, inductive reasoning allows for the possibility that the conclusion is false, even if all of the premises are true. Instead of being valid or invalid, inductive arguments are either strong or weak, which describes how probable it is that the conclusion is true. It is often said that when thinking subjectively you will see whatever you want to see. In fact this is always the case. It's just that if you truly want to see what the facts say when they are allowed to speak for themselves then you will see that too. This is called "objectivity". It is man's capacity for objective reasoning that separates him from the animals. None of us are all-human though. All of us have a little bit of ego that tries to think subjectively. (Reality is not the egos native habitat.) Back to top Morphisms :See also: and Every , f(x), has exactly one output for every input. If its , f−1(x), has exactly one output for every input then the function is . :: f^{-1}(f(x))=x :If it isn't invertible then it doesn't have an inverse function. An which is a function that is its own inverse function. Example: f(x)=x/(x-1) :: f(f(x))=x A is exactly the same as a function but in every morphism has an inverse which is allowed to have more than one value or no value at all. consist of: :Objects (usually ) : (usually ) possessing: ::one source object (domain) ::one target object (codomain) a morphism is represented by an arrow: : f(x)=y is written f : x \to y where x is in X and y is in Y. : g(y)=z is written g : y \to z where y is in Y and z is in Z. The of y is z. The (or ) of z is the set of all y whose image is z and is denoted g^{-1}z A space Y is a (a fiber bundle) of space Z if the map g : y \to z is locally . :A covering space is a if it is . ::The concept of a universal cover was first developed to define a natural domain for the of an . :::The general theory of analytic continuation and its generalizations are known as . ::::The set of can be considered to be the analytic continuation of an analytic function. A topological space is if no part of it is disconnected. A space is if there are no holes passing all the way through it (therefore any loop can be shrunk to a point) :See Composition of morphisms: : g(f(x)) is written g \circ f ::f is the of g ::f is the of g \circ f ::? is the of ? A is a map from one set to another of the same type which preserves the operations of the algebraic structure: : f(x \cdot y) = f(x) \cdot f(y) : f(x + y) = f(x) + f(y) :See ::A is a homomorphism with a domain in one category and a codomain in another. :A from (G'', ∗) to (''H, ·) is a h'' : ''G → H'' such that :: h(u*v) = h(u) \cdot h(v) = h© for all u*v = c in G. ::For example log(a*b) = log(a) + log(b) :::Since log is a homomorphism that has an inverse that is also a homomorphism, log is an of groups. The is a of the multiplicative group of \mathbb{R}^+ to the of real numbers, \mathbb{R} . :See also and A has morphisms with more than one source object. A f(v_1,\ldots,v_n) = W : : f\colon V_1 \times \cdots \times V_n \to W\text{,} has a corresponding : F(v_1\otimes \cdots \otimes v_n) = W : : F\colon V_1 \otimes \cdots \otimes V_n \to W\text{,} Back to top Numerical methods :See also: :From Wikipedia:Numerical analysis: One of the simplest problems is the evaluation of a function at a given point. The most straightforward approach, of just plugging in the number in the formula is sometimes not very efficient. For polynomials, a better approach is using the , since it reduces the necessary number of multiplications and additions. Generally, it is important to estimate and control arising from the use of arithmetic. solves the following problem: given the value of some unknown function at a number of points, what value does that function have at some other point between the given points? is very similar to interpolation, except that now we want to find the value of the unknown function at a point which is outside the given points. is also similar, but it takes into account that the data is imprecise. Given some points, and a measurement of the value of some function at these points (with an error), we want to determine the unknown function. The -method is one popular way to achieve this. Much effort has been put in the development of methods for solving . :Standard direct methods, i.e., methods that use some :: , , for (or ) and , and for non-square matrices. : :: , , and are usually preferred for large systems. General iterative methods can be developed using a . are used to solve nonlinear equations. :If the function is and the derivative is known, then is a popular choice. : is another technique for solving nonlinear equations. problems ask for the point at which a given function is maximized (or minimized). Often, the point also has to satisfy some . : If you set up 100 fans to blow air from one end of the room to the other and then you drop a feather into the wind, what happens? The feather will follow the air currents, which may be very complex. One approximation is to measure the speed at which the air is blowing near the feather every second, and advance the simulated feather as if it were moving in a straight line at that same speed for one second, before measuring the wind speed again. This is called the for solving an ordinary differential equation. Back to top Information theory From Wikipedia:Information theory: Information theory studies the quantification, storage, and communication of information. Communications over a channel—such as an ethernet cable—is the primary motivation of information theory. From Wikipedia:Quantities of information: Shannon derived a measure of information content called the ' ' or '"surprisal"' of a message ''m: : I(m) = \log \left( \frac{1}{p(m)} \right) = - \log( p(m) ) \, where p(m) = \mathrm{Pr}(M=m) is the probability that message m is chosen from all possible choices in the message space M . The base of the logarithm only affects a scaling factor and, consequently, the units in which the measured information content is expressed. If the logarithm is base 2, the measure of information is expressed in units of . Information is transferred from a source to a recipient only if the recipient of the information did not already have the information to begin with. Messages that convey information that is certain to happen and already known by the recipient contain no real information. Infrequently occurring messages contain more information than more frequently occurring messages. This fact is reflected in the above equation - a certain message, i.e. of probability 1, has an information measure of zero. In addition, a compound message of two (or more) unrelated (or mutually independent) messages would have a quantity of information that is the sum of the measures of information of each message individually. That fact is also reflected in the above equation, supporting the validity of its derivation. An example: The weather forecast broadcast is: "Tonight's forecast: Dark. Continued darkness until widely scattered light in the morning." This message contains almost no information. However, a forecast of a snowstorm would certainly contain information since such does not happen every evening. There would be an even greater amount of information in an accurate forecast of snow for a warm location, such as Miami. The amount of information in a forecast of snow for a location where it never snows (impossible event) is the highest (infinity). The more surprising a message is the more information it conveys. The message "LLLLLLLLLLLLLLLLLLLLLLLLL" conveys exactly as much information as the message "25Ls". The first message which is 25 bytes long can therefore be "compressed" into the second message which is only 4 bytes long. Back to top Early computers :Python 3.6 for beginners, https://repl.it/languages, https://www.wolframalpha.com :See also: * * * * * * * * * * * * * * * Back to top Tactical thinking :See also :From Wikipedia:Game theory: In the accompanying example there are two players; Player one (blue) chooses the row and player two (red) chooses the column. Each player must choose without knowing what the other player has chosen. The payoffs are provided in the interior. The first number is the payoff received by Player 1; the second is the payoff for Player 2. Tit for tat is a simple and highly effective tactic in game theory for the iterated prisoner's dilemma. An agent using this tactic will first cooperate, then subsequently replicate an opponent's previous action. If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not.Wikipedia:Tit for tat In zero-sum games the sum of the payoffs is always zero (meaning that a player can only benefit at the expense of others). Cooperation is impossible in a zero-sum game. John Forbes Nash proved that there is a Nash equilibrium (an optimum tactic) for every finite game. In the zero-sum game shown to the right the optimum tactic for player 1 is to randomly choose A or B with equal probability. Strategic thinking differs from tactical thinking by taking into account how the short term goals and therefore optimum tactics change over time. For example the opening, middlegame, and endgame of chess require radically different tactics. See also: Back to top Next section: Intermediate mathematics/Physics